Solving POMDPs by Searching in Policy Space

نویسنده

  • Eric A. Hansen
چکیده

Most algorithms for solving POMDPs itera­ tively improve a value function that implic­ itly represents a policy and are said to search in value function space. This paper presents an approach to solving POMDPs that repre­ sents a policy explicitly as a finite-state con­ troller and iteratively improves the controller by search in policy space. Two related al­ gorithms illustrate this approach. The first is a policy iteration algorithm that can out­ perform value iteration in solving infinite­ horizon POMDPs. It provides the founda­ tion for a new heuristic search algorithm that promises further speedup by focusing compu­ tational effort on regions of the problem space that are reachable, or likely to be reached, from a start state.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Solving POMDPs by Searching the Space of Finite Policies

Solving partially observable Markov decision processes (POMDPs) is highly intractable in general, at least in part because the optimal policy may be infinitely large. In this paper, we explore the problem of finding the optimal policy from a restricted set of policies, represented as finite state automata of a given size. This problem is also intractable, but we show that the complexity can be ...

متن کامل

Producing efficient error-bounded solutions for transition independent decentralized mdps

There has been substantial progress on algorithms for single-agent sequential decision making using partially observable Markov decision processes (POMDPs). A number of efficient algorithms for solving POMDPs share two desirable properties: error-bounds and fast convergence rates. Despite significant efforts, no algorithms for solving decentralized POMDPs benefit from these properties, leading ...

متن کامل

PEGASUS: A policy search method for large MDPs and POMDPs

We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a model. Our approach is based on the following observation: Any (PO)MDP can be transformed into an “equivalent” POMDP in which all state transitions (given the current state and action) are deterministic. This reduces the...

متن کامل

Point-based policy generation for decentralized POMDPs

Memory-bounded techniques have shown great promise in solving complex multi-agent planning problems modeled as DEC-POMDPs. Much of the performance gains can be attributed to pruning techniques that alleviate the complexity of the exhaustive backup step of the original MBDP algorithm. Despite these improvements, state-of-the-art algorithms can still handle a relative small pool of candidate poli...

متن کامل

Implementation Techniques for Solving POMDPs in Personal Assistant Domains

Agents or agent teams deployed to assist humans often face the challenges of monitoring the state of key processes in their environment (including the state of their human users themselves) and making periodic decisions based on such monitoring. POMDPs appear well suited to enable agents to address these challenges, given the uncertain environment and cost of actions, but optimal policy generat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998